Goto

Collaborating Authors

 Marin County



Gavin Newsom Is Playing the Long Game

The New Yorker

He catches nascent changes in the political weather. "During early, he kept telling me, 'Crime--there's something here,' " DeBoo told me. DeBoo studied the latest crime statistics and saw nothing unusual. He brushed off the worry. Then new numbers came out, showing a large pandemic spike in shoplifting and car theft, and concerns about crime exploded into the headlines. Last March, judging the winds, Newsom launched a podcast, "This Is Gavin Newsom."


Beyond the Black Box: A Cognitive Architecture for Explainable and Aligned AI

Keyi, Hu

arXiv.org Artificial Intelligence

Current AI paradigms, as "architects of experience," face fundamental challenges in explainability and value alignment. This paper introduces "Weight-Calculatism," a novel cognitive architecture grounded in first principles, and demonstrates its potential as a viable pathway toward Artificial General Intelligence (AGI). The architecture deconstructs cognition into indivisible Logical Atoms and two fundamental operations: Pointing and Comparison. Decision-making is formalized through an interpretable Weight-Calculation model (Weight = Benefit * Probability), where all values are traceable to an auditable set of Initial Weights. This atomic decomposition enables radical explainability, intrinsic generality for novel situations, and traceable value alignment. We detail its implementation via a graph-algorithm-based computational engine and a global workspace workflow, supported by a preliminary code implementation and scenario validation. Results indicate that the architecture achieves transparent, human-like reasoning and robust learning in unprecedented scenarios, establishing a practical and theoretical foundation for building trustworthy and aligned AGI.



DiscoTrack: A Multilingual LLM Benchmark for Discourse Tracking

Bu, Lanni, Levine, Lauren, Zeldes, Amir

arXiv.org Artificial Intelligence

Recent LLM benchmarks have tested models on a range of phenomena, but are still focused primarily on natural language understanding for extraction of explicit information, such as QA or summarization, with responses often targeting information from individual sentences. We are still lacking more challenging, and importantly also multilingual, benchmarks focusing on implicit information and pragmatic inferences across larger documents in the context of discourse tracking: integrating and aggregating information across sentences, paragraphs and multiple speaker utterances. To this end, we present DiscoTrack, an LLM benchmark targeting a range of tasks across 12 languages and four levels of discourse understanding: salience recognition, entity tracking, discourse relations and bridging inference. Our evaluation shows that these tasks remain challenging, even for state-of-the-art models.


A Foundational Theory of Quantitative Abstraction: Adjunctions, Duality, and Logic for Probabilistic Systems

Anwer, Nivar, López-Rubio, Ezequiel, Elizondo, David, Luque-Baena, Rafael M.

arXiv.org Artificial Intelligence

The analysis and control of stochastic dynamical systems rely on probabilistic models such as (continuous-space) Markov decision processes, but large or continuous state spaces make exact analysis intractable and call for principled quantitative abstraction. This work develops a unified theory of such abstraction by integrating category theory, coalgebra, quantitative logic, and optimal transport, centred on a canonical $\varepsilon$-quotient of the behavioral pseudo-metric with a universal property: among all abstractions that collapse behavioral differences below $\varepsilon$, it is the most detailed, and every other abstraction achieving the same discounted value-loss guarantee factors uniquely through it. Categorically, a quotient functor $Q_\varepsilon$ from a category of probabilistic systems to a category of metric specifications admits, via the Special Adjoint Functor Theorem, a right adjoint $R_\varepsilon$, yielding an adjunction $Q_\varepsilon \dashv R_\varepsilon$ that formalizes a duality between abstraction and realization; logically, a quantitative modal $μ$-calculus with separate reward and transition modalities is shown, for a broad class of systems, to be expressively complete for the behavioral pseudo-metric, with a countable fully abstract fragment suitable for computation. The theory is developed coalgebraically over Polish spaces and the Giry monad and validated on finite-state models using optimal-transport solvers, with experiments corroborating the predicted contraction properties and structural stability and aligning with the theoretical value-loss bounds, thereby providing a rigorous foundation for quantitative state abstraction and representation learning in probabilistic domains.


Validation of collision-free spheres of Stewart-Gough platforms for constant orientations using the Application Programming Interface of a CAD software

Patra, Bibekananda, Chittawadigi, Rajeevlochana G., Bandyopadhyay, Sandipan

arXiv.org Artificial Intelligence

This paper presents a method of validation of the size of the largest collision-free sphere (CFS) of a 6-6 Stewart-Gough platform manipulator (SGPM) for a given orientation of its moving platform (MP) using the Application Programming Interface (API) of a CAD software. The position of the MP is updated via the API in an automated manner over a set of samples within a shell enclosing the surface of the CFS. For each pose of the manipulator, each pair of legs is investigated for mutual collisions. The CFS is considered safe or validated iff none of the points falling inside the CFS lead to a collision between any pair of legs. This approach can not only validate the safety of a precomputed CFS, but also estimate the same for any spatial parallel manipulator.


Tensor Logic: The Language of AI

Domingos, Pedro

arXiv.org Machine Learning

Progress in AI is hindered by the lack of a programming language with all the requisite features. Libraries like PyTorch and TensorFlow provide automatic differentiation and efficient GPU implementation, but are additions to Python, which was never intended for AI. Their lack of support for automated reasoning and knowledge acquisition has led to a long and costly series of hacky attempts to tack them on. On the other hand, AI languages like LISP and Prolog lack scalability and support for learning. This paper proposes tensor logic, a language that solves these problems by unifying neural and symbolic AI at a fundamental level. The sole construct in tensor logic is the tensor equation, based on the observation that logical rules and Einstein summation are essentially the same operation, and all else can be reduced to them. I show how to elegantly implement key forms of neural, symbolic and statistical AI in tensor logic, including transformers, formal reasoning, kernel machines and graphical models. Most importantly, tensor logic makes new directions possible, such as sound reasoning in embedding space. This combines the scalability and learnability of neural networks with the reliability and transparency of symbolic reasoning, and is potentially a basis for the wider adoption of AI.


Understanding the Repeat Curse in Large Language Models from a Feature Perspective

Yao, Junchi, Yang, Shu, Xu, Jianhua, Hu, Lijie, Li, Mengdi, Wang, Di

arXiv.org Artificial Intelligence

Large language models (LLMs) have made remarkable progress in various domains, yet they often suffer from repetitive text generation, a phenomenon we refer to as the "Repeat Curse". While previous studies have proposed decoding strategies to mitigate repetition, the underlying mechanism behind this issue remains insufficiently explored. In this work, we investigate the root causes of repetition in LLMs through the lens of mechanistic interpretability. Inspired by recent advances in Sparse Autoencoders (SAEs), which enable monosemantic feature extraction, we propose a novel approach, "Duplicatus Charm", to induce and analyze the Repeat Curse. Our method systematically identifies "Repetition Features" -the key model activations responsible for generating repetitive outputs. First, we locate the layers most involved in repetition through logit analysis. Next, we extract and stimulate relevant features using SAE-based activation manipulation. To validate our approach, we construct a repetition dataset covering token and paragraph level repetitions and introduce an evaluation pipeline to quantify the influence of identified repetition features. Furthermore, by deactivating these features, we have effectively mitigated the Repeat Curse. The source code of our work is publicly available at: https://github.com/kaustpradalab/repeat-curse-llm